Feature selection for imbalanced data based on neighborhood tolerance mutual information and whale optimization algorithm
Lin SUN, Jinxu HUANG, Jiucheng XU
Journal of Computer Applications    2023, 43 (6): 1842-1854.   DOI: 10.11772/j.issn.1001-9081.2022050691

Most feature selection algorithms fail to fully consider the non-uniform class distribution of data, the correlation between features, and the influence of different parameters on the feature selection results. To address these problems, a feature selection method for imbalanced data based on neighborhood tolerance mutual information and the Whale Optimization Algorithm (WOA) was proposed. Firstly, for binary and multi-class datasets in an incomplete neighborhood decision system, two kinds of feature importance for imbalanced data were defined on the basis of the upper and lower boundary regions. Then, to fully reflect the decision-making ability of features and the correlation between features, neighborhood tolerance mutual information was developed. Finally, by integrating the feature importance of imbalanced data with neighborhood tolerance mutual information, a Feature Selection algorithm for Imbalanced Data based on Neighborhood tolerance mutual information (FSIDN) was designed, in which the optimal parameters of the feature selection algorithm were obtained by WOA; a nonlinear convergence factor and an adaptive inertia weight were introduced to improve WOA and prevent it from falling into local optima. Experiments on 8 benchmark functions show that the improved WOA has good optimization performance, and feature selection experiments on 13 binary and 4 multi-class imbalanced datasets show that, compared with related algorithms, the proposed algorithm effectively selects feature subsets with good classification performance.
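The abstract's improved WOA can be sketched as below. The paper does not give the exact functional forms of its nonlinear convergence factor or adaptive inertia weight, so the cosine decay of `a` and the linearly shrinking weight `w` here are illustrative assumptions, as is the use of WOA's standard encircling/spiral update rules.

```python
import math
import random

def improved_woa(fitness, dim, n_whales=20, max_iter=100, lb=-10.0, ub=10.0, seed=0):
    """Minimize `fitness` with a WOA variant using a nonlinear convergence
    factor and an adaptive inertia weight (illustrative forms, not the
    paper's exact definitions)."""
    rng = random.Random(seed)
    whales = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=fitness)[:]
    best_fit = fitness(best)
    for t in range(max_iter):
        # Nonlinear convergence factor: decays along a cosine curve instead of
        # the standard linear 2 -> 0 schedule, keeping exploration longer
        # (hypothetical form).
        a = 2.0 * math.cos(math.pi / 2 * t / max_iter)
        # Adaptive inertia weight: shrinks as iterations progress
        # (hypothetical form).
        w = 0.9 - 0.5 * t / max_iter
        for i, x in enumerate(whales):
            r1, r2 = rng.random(), rng.random()
            A, C = 2 * a * r1 - a, 2 * r2
            new = x[:]
            if rng.random() < 0.5:
                if abs(A) < 1:          # exploit: encircle the current best
                    for d in range(dim):
                        new[d] = w * best[d] - A * abs(C * best[d] - x[d])
                else:                   # explore: move relative to a random whale
                    rand = whales[rng.randrange(n_whales)]
                    for d in range(dim):
                        new[d] = w * rand[d] - A * abs(C * rand[d] - x[d])
            else:                       # spiral update toward the best
                l = rng.uniform(-1, 1)
                for d in range(dim):
                    new[d] = (abs(best[d] - x[d]) * math.exp(l)
                              * math.cos(2 * math.pi * l) + w * best[d])
            whales[i] = [min(ub, max(lb, v)) for v in new]
            f = fitness(whales[i])
            if f < best_fit:
                best, best_fit = whales[i][:], f
    return best, best_fit
```

In the paper's setting, `fitness` would score a candidate parameter configuration of the FSIDN algorithm; here it can be any objective, e.g. a benchmark function such as the sphere function.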

Feature selection algorithm based on neighborhood rough set and monarch butterfly optimization
Lin SUN, Jing ZHAO, Jiucheng XU, Xinya WANG
Journal of Computer Applications    2022, 42 (5): 1355-1366.   DOI: 10.11772/j.issn.1001-9081.2021030497

The classical Monarch Butterfly Optimization (MBO) algorithm cannot handle continuous data well, and the rough set model cannot adequately process large-scale, high-dimensional, complex data. To address these problems, a new feature selection algorithm based on Neighborhood Rough Set (NRS) and MBO was proposed. Firstly, local disturbance and a group division strategy were combined with the MBO algorithm, and a transmission mechanism was constructed to form a Binary MBO (BMBO) algorithm. Secondly, a mutation operator was introduced to enhance the exploration ability of the algorithm, yielding a BMBO based on the Mutation operator (BMBOM) algorithm. Then, a fitness function was developed based on the neighborhood dependence degree in NRS, and the fitness values of the initialized feature subsets were evaluated and sorted. Finally, the BMBOM algorithm was used to search for the optimal feature subset through continuous iteration, and a meta-heuristic feature selection algorithm was designed. The optimization performance of the BMBOM algorithm was evaluated on benchmark functions, and the classification performance of the proposed feature selection algorithm was evaluated on UCI datasets. Experimental results show that the BMBOM algorithm is significantly better than the MBO and Particle Swarm Optimization (PSO) algorithms in terms of the best value, worst value, average value, and standard deviation on five benchmark functions. Compared with optimized feature selection algorithms based on rough sets, algorithms combining rough sets with optimization algorithms, algorithms combining NRS with optimization algorithms, and algorithms based on binary grey wolf optimization, the proposed algorithm performs well on three indicators (classification accuracy, number of selected features, and fitness value) on UCI datasets, and can select optimal feature subsets with few features and high classification accuracy.
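The neighborhood-dependence-based fitness mentioned above can be sketched as follows, using a common NRS formulation: a sample lies in the positive region if every sample within distance `delta` of it (measured over the selected features) shares its label, and the dependence degree is the fraction of samples in the positive region. The `delta` threshold, the `alpha` trade-off weight, and the exact fitness weighting are illustrative assumptions; the paper's precise definitions may differ.

```python
import math

def neighborhood_dependence(X, y, features, delta=0.3):
    """Neighborhood dependence degree of the decision attribute on a feature
    subset (a standard NRS formulation, assumed here). X is a list of feature
    vectors, y the class labels, features the indices of selected features."""
    n = len(X)
    pos = 0
    for i in range(n):
        consistent = True
        for j in range(n):
            # Euclidean distance restricted to the selected features.
            dist = math.sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in features))
            if dist <= delta and y[j] != y[i]:
                consistent = False   # a differently labeled neighbor: not in POS
                break
        if consistent:
            pos += 1
    return pos / n

def fitness(X, y, subset, n_total, alpha=0.9, delta=0.3):
    """Fitness balancing dependence degree against subset size
    (illustrative weighting; alpha is an assumed trade-off parameter)."""
    gamma = neighborhood_dependence(X, y, subset, delta)
    return alpha * gamma + (1 - alpha) * (1 - len(subset) / n_total)
```

In the BMBOM search, each candidate binary feature subset would be scored with such a fitness function, rewarding subsets that preserve label consistency within neighborhoods while using few features.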
